AITopics | matrix-vector product

Frank-Wolfe-based Algorithms for Approximating Tyler's M-estimator

Neural Information Processing SystemsApr-24-2026, 19:56:52 GMT

Tyler's M-estimator is a well known procedure for robust and heavy-tailed covariance estimation. Tyler himself suggested an iterative fixed-point algorithm for computing his estimator however, it requires super-linear (in the size of the data) runtime per iteration, which maybe prohibitive in large scale. In this work we propose, to the best of our knowledge, the first Frank-Wolfe-based algorithms for computing Tyler's estimator. One variant uses standard Frank-Wolfe steps, the second also considers away-steps (AFW), and the third is a geodesic version of AFW (GAFW). AFW provably requires, up to a log factor, only linear time per iteration, while GAFW runs in linear time (up to a log factor) in a large n(number of data-points) regime. All three variants are shown to provably converge to the optimal solution with sublinear rate, under standard assumptions, despite the fact that the underlying optimization problem is not convex nor smooth. Under an additional fairly mild assumption, that holds with probability 1 when the (normalized) data-points are i.i.d.

artificial intelligence, iteration, machine learning, (17 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Israel (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.86)

Add feedback

neurips_camera_ready

Anthony Caterini

Neural Information Processing SystemsFeb-19-2026, 12:34:36 GMT

likelihood, manifold, neural information processing system, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Gradients of Functions of Large Matrices

Neural Information Processing SystemsFeb-14-2026, 00:07:29 GMT

Tuning scientific and probabilistic machine learning models - for example, partial differential equations, Gaussian processes, or Bayesian neural networks - often relies on evaluating functions of matrices whose size grows with the data set or the number of parameters. While the state-of-the-art for evaluating these quantities is almost always based on Lanczos and Arnoldi iterations, the present work is the first to explain how to differentiate these workhorses of numerical linear algebra efficiently. To get there, we derive previously unknown ad-joint systems for Lanczos and Arnoldi iterations, implement them in JAX, and show that the resulting code can compete with Diffrax when it comes to differentiating PDEs, GPyTorch for selecting Gaussian process models and beats standard factorisation methods for calibrating Bayesian neural networks. All this is achieved without any problem-specific code optimisation.

artificial intelligence, iteration, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Denmark > Capital Region > Kongens Lyngby (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(4 more...)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Analysis of Krylov Subspace Solutions of Regularized Non-Convex Quadratic Problems

Yair Carmon, John C. Duchi

Neural Information Processing SystemsFeb-12-2026, 14:33:07 GMT

Neural Information Processing Systems http://nips.cc/

condition number, matrix-vector product, trust-region problem, (13 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Add feedback

Faster Linear Algebra for Distance Matrices

Neural Information Processing SystemsFeb-12-2026, 13:27:04 GMT

Such complexities are prohibitive for scaling to large datasets.

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

fcdf698a5d673435e0a5a6f9ffea05ca-Supplemental.pdf

Neural Information Processing SystemsFeb-12-2026, 01:10:31 GMT

matrix, matrix-vector product, variance, (13 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.36)

Add feedback

1787533e171dcc8549cc2eb5a4840eec-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-7-2026, 16:15:02 GMT

algorithm 1, iteration, variant, (15 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Israel > Haifa District > Haifa (0.04)
Europe > France > Bourgogne-Franche-Comté > Doubs > Besançon (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

Faster Linear Algebra for Distance Matrices

Neural Information Processing SystemsDec-25-2025, 14:26:00 GMT

The distance matrix of a dataset $X$ of $n$ points with respect to a distance function $f$ represents all pairwise distances between points in $X$ induced by $f$. Due to their wide applicability, distance matrices and related families of matrices have been the focus of many recent algorithmic works. We continue this line of research and take a broad view of algorithm design for distance matrices with the goal of designing fast algorithms, which are specifically tailored for distance matrices, for fundamental linear algebraic primitives. Our results include efficient algorithms for computing matrix-vector products for a wide class of distance matrices, such as the $\ell_1$ metric for which we get a linear runtime, as well as an $\Omega(n^2)$ lower bound for any algorithm which computes a matrix-vector product for the $\ell_{\infty}$ case, showing a separation between the $\ell_1$ and the $\ell_{\infty}$ metrics. Our upper bound results in conjunction with recent works on the matrix-vector query model have many further downstream applications, including the fastest algorithm for computing a relative error low-rank approximation for the distance matrix induced by $\ell_1$ and $\ell_2^2$ functions and the fastest algorithm for computing an additive error low-rank approximation for the $\ell_2$ metric, in addition to applications for fast matrix multiplication among others. We also give algorithms for constructing distance matrices and show that one can construct an approximate $\ell_2$ distance matrix in time faster than the bound implied by the Johnson-Lindenstrauss lemma.

algorithm, distance matrix, matrix, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.55)

Add feedback

Analysis of Krylov Subspace Solutions of Regularized Non-Convex Quadratic Problems

Yair Carmon, John C. Duchi

Neural Information Processing SystemsNov-20-2025, 15:37:50 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, condition number, machine learning, (15 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Add feedback

Accelerating Natural Gradient Descent for PINNs with Randomized Nyström Preconditioning

Bioli, Ivan, Marcati, Carlo, Sangalli, Giancarlo

arXiv.org Artificial IntelligenceOct-16-2025

Natural Gradient Descent (NGD) has emerged as a promising optimization algorithm for training neural network-based solvers for partial differential equations (PDEs), such as Physics-Informed Neural Networks (PINNs). However, its practical use is often limited by the high computational cost of solving linear systems involving the Gramian matrix. While matrix-free NGD methods based on the conjugate gradient (CG) method avoid explicit matrix inversion, the ill-conditioning of the Gramian significantly slows the convergence of CG. In this work, we extend matrix-free NGD to broader classes of problems than previously considered and propose the use of Randomized Nyström preconditioning to accelerate convergence of the inner CG solver. The resulting algorithm demonstrates substantial performance improvements over existing NGD-based methods and other state-of-the-art optimizers on a range of PDE problems discretized using neural networks.

artificial intelligence, iteration, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2505.11638

Country: